A functional clustering algorithm for the analysis of dynamic network data

S. Feldt,1 J. Waddell,2 V. L. Hetrick,3 J. D. Berke,3 and M. Zochowski1,4

1 Department of Physics, University of Michigan, Ann Arbor, Michigan 48109, USA
2 Department of Mathematics, University of Michigan, Ann Arbor, Michigan 48109, USA
3 Department of Psychology, University of Michigan, Ann Arbor, Michigan 48109, USA
4 Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, USA



(Dated: January 7, 2009) 

Abstract 

We formulate a novel technique for the detection of functional clusters in discrete event data. The
advantage of this algorithm is that no prior knowledge of the number of functional groups is needed,
as our procedure progressively combines data traces and derives the optimal clustering cutoff in a
simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the
power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both
simulated neural spike train data and real neural data obtained from the mouse hippocampus during
exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs
better than existing methods. In the experimental data, we observe state-dependent clustering
patterns consistent with known neurophysiological processes involved in memory consolidation.
I. INTRODUCTION



The detection of structural network properties has recently been recognized to be of great
importance in aiding understanding of the properties of a variety of man-made and natural
networks [1, 2, 3, 4]. Here, however, two significantly different notions of network structure
have to be distinguished. One is the physical (or anatomical) structure of the network. In
this case, community structure refers to groups of nodes within a network which are more
highly connected to other nodes in the group than to the rest of the network. Here, multiple
techniques exist which utilize a knowledge of the network topology (adjacency matrix) to
extract this hidden structure.

The other type of structure is the functional structure, which refers to a commonality 
of function of subsets of units within the network, generally observed by monitoring the 
similarities in the dynamics of nodes 

of physical connection between the network elements) is replaced with the notion of functional
commonality (or proximity), which can rapidly evolve based on the observed dynamics. 

While the physical connectivity of the network can be obtained for many man-made and
some biological networks, there are large classes of networks where physical or anatomical
structure cannot be obtained. The brain is a prime example of such a system: the cortex
alone contains around $1.5 \times 10^{14}$ tightly packed connections (synapses), and it is clearly
impossible to derive any detailed properties of its connectivity. It is not even completely clear
that such detailed knowledge of the connectivity would be particularly useful in
understanding brain function, as the connectivity evolves significantly during the lifetime of an individual
through such processes as neuronal loss, adult neurogenesis and constant rewiring (i.e., creation,
annihilation and modulation of synapses). Also, since it is known that brain function
is distributed over large neuronal ensembles, or even more globally, between different brain
modalities, it becomes imperative to understand how these ensembles self-organize to generate
desired functions (movement, memory storage/recall, etc.) [11]. The development
of techniques that allow the activity of many cells to be simultaneously monitored provides
hope for a clearer understanding of these neural codes, but also demands novel tools for the
detection and characterization of spatio-temporal patterning of this activity.

While it is assumed that these ensembles are formed dynamically through
spatio-temporal interactions of activity patterns of many individual neurons, the neural
correlates of cognition are not well understood. One of the most prominent hypotheses
addressing this issue is the temporal correlation hypothesis, in which it is
assumed that correlations between activity patterns of neurons mediate feature binding and
thus the formation of intermittent functional ensembles in the brain. Thus, functional clustering
can potentially be reduced to the identification of temporally correlated groups of neurons.

In order to successfully capture the (physical or functional) community structure of a
network, a clustering algorithm should have two important properties: the ability to detect
relationships between nodes in order to form clusters, and the ability to determine the
specific set of clusters which optimally characterize the network structure. While some
clustering methods have been designed to extract the structure directly from the dynamics
of the neurons [26], most methods rely on using a similarity measure to
compute distances in similarity space between neurons, and then use structural clustering
methods to determine the functional groupings [27], [28], [29, 30, 31]. However, a major
problem becomes distinguishing statistically significant community structures from spurious
ones. To achieve this goal, current structural clustering techniques involve an optimization
of the network modularity [32], [33] or require a prior knowledge of the number of communities
[37].



In this paper, we develop a novel clustering method that does not depend on structural 
network information, but instead derives the functional network structure from the temporal 
interdependencies of its elements. We refer to this method as the Functional Clustering 
Algorithm (FCA). The key advantage of this algorithm is that it incorporates a natural 
cutoff point to cease clustering and obtain the functional groupings without an a priori 
knowledge of the number of groups. Additionally, the algorithm can be used with a variety 
of different similarity measures, allowing it to detect functional groupings based on multiple 
features of the data. Although this paper focuses on the application to neural data in the 
form of spike trains, the FCA can be applied to any type of discrete event data. 

The paper is organized as follows: we first introduce the Functional Clustering Algorithm, 
along with a novel similarity metric designed to detect co-firing events in neural data. We 
then compare the performance of the algorithm to two existing methods using simulated 
data, and show that it performs better than existing measures. Finally, we demonstrate 
the application of our new algorithm to experimental data exploring progressive memory 
consolidation in the hippocampus. 
II. THE FUNCTIONAL CLUSTERING ALGORITHM



Here we introduce the Functional Clustering Algorithm (FCA), which is tailored to detect
functional clusters of network elements. The algorithm can be applied to any type of discrete
event data; however, this paper focuses only on its application to neural
spike train data.

The FCA dynamically groups pairs of spike trains based on a chosen similarity metric, 
forming progressively more complex spike patterns. We will introduce a new similarity 
measure which is used for the data analyzed in this paper, but any pairwise similarity 
measure can be chosen. The specific choice of the metric should depend on the nature of the 
data being analyzed and the type of functional relationships which one chooses to detect. 

A general description of the FCA is as follows (see the subsequent sections for detailed
descriptions and Fig. 1 for a schematic of the algorithm):

1. We first create a matrix of pairwise similarity values between all spike trains.

2. We then use surrogate data sets to calculate 95% confidence intervals for each pairwise
similarity. These significance levels are used to calculate the scaled significance
of each pairwise similarity value (see Sect. II B for the definition of scaled significance).

3. The pair of trains with the highest significance is then chosen to be grouped together,
and the scaled significance of this pair is recorded. A unique element of the FCA is
that the two spike trains which are grouped together are then merged by joining the
spikes into a single new train (see Fig. 1(a)). This allows for a cumulative assessment
of similarity between the existing complex cluster and the other trains.

4. The trains which are being joined are then removed, the similarity matrix is recalculated
for the new set of trains, new surrogate data sets are created, and a new scaled
significance matrix is calculated.

5. We repeat the joining steps (3-4), recording the scaled significance value used in each
step of the algorithm, until the point at which no pairwise similarity is statistically
significant, indicating that the next joining step is not statistically meaningful. We
refer to this step as the clustering cutoff (dashed red line in Fig. 1). At this point,
the functional groupings are determined by observing which spike trains have been
combined during the clustering algorithm.
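
The joining procedure in steps 1-5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `functional_clustering`, `pair_score` and `cutoff` are hypothetical, and `pair_score(a, b)` is assumed to return the scaled significance of a pair of trains (including whatever surrogate comparison is used), so that values above 1 are significant.

```python
from itertools import combinations

def functional_clustering(trains, pair_score, cutoff=1.0):
    """Greedy merging in the spirit of steps 1-5 (illustrative sketch).

    trains     : list of sequences of event times
    pair_score : hypothetical callable (train_a, train_b) -> scaled significance
    cutoff     : merging stops once the best score is not above this value
    Returns (clusters, history): clusters lists the original train indices in
    each group; history records (score, members) for every joining step.
    """
    # Each active element holds its merged event times and its member indices.
    active = [(sorted(t), [idx]) for idx, t in enumerate(trains)]
    history = []
    while len(active) > 1:
        # Steps 1-2: score every pair of currently active trains.
        score, i, j = max(
            ((pair_score(active[i][0], active[j][0]), i, j)
             for i, j in combinations(range(len(active)), 2)),
            key=lambda item: item[0],
        )
        # Step 5: stop when the best remaining pair is not significant.
        if score <= cutoff:
            break
        # Step 3: merge by joining the events of both trains into one train.
        merged = (sorted(active[i][0] + active[j][0]),
                  active[i][1] + active[j][1])
        history.append((score, merged[1]))
        # Step 4: drop the two joined trains and keep the merged one;
        # all scores are recomputed on the next pass of the loop.
        active = [a for k, a in enumerate(active) if k not in (i, j)] + [merged]
    return [members for _, members in active], history
```

Because the scores are recomputed from the merged trains on every pass, the sketch keeps the cumulative character of the procedure, at the cost of re-scoring all pairs after each join.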

A key advantage of this algorithm is that the ongoing comparison of the similarity metric
obtained from the data with that from the surrogates causes the algorithm to have a natural
stopping point, meaning that one does not need an a priori knowledge of the number of
functional groups embedded in the data. Gerstein et al. [12] also developed an aggregation
method based on grouping neurons with significant coincident firings, but this method results
in the formation of non-unique strings of related neurons as opposed to well-defined
functional groupings. We now discuss the details of the implementation of the FCA in the
following sections.



A. Average Minimum Distance 

For the data presented in this paper, we use a new similarity metric which we call the
Average Minimum Distance (AMD) to determine functional groupings. The AMD is useful
in capturing similarities due to coincident firing between neurons. Note that other metrics
could be chosen, depending upon the nature of the recorded data. To compute the AMD
between two spike trains $S_i$ and $S_j$, we calculate the distance $\Delta t_k^i$ from each spike in $S_i$ to
the closest spike in $S_j$ as shown in Fig. 1(d). We then define

$$D_{ij} = \frac{1}{N_i} \sum_k \Delta t_k^i, \qquad (1)$$

where $N_{i/j}$ is the total number of spikes in $S_i$ or $S_j$, respectively, and $D_{ji}$ is defined
analogously. Finally, we define the AMD between spike trains $S_i$ and $S_j$ to be

$$\mathrm{AMD}_{ij} = D_{ij} + D_{ji}. \qquad (2)$$
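
As an illustration of the AMD computation, the following Python sketch evaluates the two directed averages and combines them. The function names are hypothetical, both trains are assumed to be non-empty, and Eq. (2) is read here as the sum of the two directed averages; a constant factor such as 1/2 would not change which pair of trains ranks as most similar.

```python
from bisect import bisect_left

def directed_distance(s_i, s_j):
    """D_ij: mean distance from each spike in s_i to the closest spike in s_j."""
    s_j = sorted(s_j)
    total = 0.0
    for t in s_i:
        pos = bisect_left(s_j, t)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = s_j[max(pos - 1, 0):pos + 1]
        total += min(abs(t - u) for u in candidates)
    return total / len(s_i)

def amd(s_i, s_j):
    """AMD as read from Eq. (2): sum of the two directed averages (assumption)."""
    return directed_distance(s_i, s_j) + directed_distance(s_j, s_i)
```

For example, for trains with spikes at times (1, 5) and (2, 9), the directed averages are 2 and 2.5, so `amd` returns 4.5; identical trains give 0.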



B. Calculation of Significance 



In order to determine the significance between two trains, we create 5,000-10,000 surrogate
data sets and calculate pairwise similarities for each surrogate set. The surrogate spike
trains are created by adding a jitter to each spike in the train. This jitter is drawn from a
normal distribution, similar to the technique developed by Date [39]. The method of
adding jitter to spikes (also known as dithering or teetering) to create surrogate data sets is
commonly used when analyzing neural data and has been shown to eliminate correlations
between spike timings [40], [41]. Creating the surrogate trains in this manner preserves
the frequency of each train while keeping the gross properties of the interspike-interval
distribution.

We examine the distribution of similarity values and create the cumulative distribution 
function (CDF) to determine the 95% level of significance. The scaled significance (Fig. 2
and Fig. 7) is measured in units defined as the distance from the midpoint of the CDF to
the 95% significance cutoff. Thus, a scaled significance value equal to one denotes the 95% 
significance level, and values higher than one are significant while values lower than one are 
deemed insignificant. 
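
A possible reading of this significance calculation is sketched below in Python. Several points are assumptions rather than statements from the text: the "midpoint of the CDF" is taken to be the median of the surrogate values, the 95% cutoff for a distance-like measure such as the AMD is taken on the lower tail (the 5th percentile), and the names `jitter_surrogate`, `scaled_significance`, `sigma` and `lower_is_closer` are hypothetical.

```python
import random
from statistics import median

def jitter_surrogate(train, sigma, rng):
    """Surrogate train: add zero-mean Gaussian jitter (std sigma) to every spike."""
    return sorted(t + rng.gauss(0.0, sigma) for t in train)

def scaled_significance(observed, surrogate_values, lower_is_closer=True, level=0.95):
    """Distance of `observed` from the surrogate median, in units of the
    distance between the median and the `level` significance cutoff.
    A result of 1 sits exactly on the cutoff; larger values are significant.
    Assumes the surrogate median and the cutoff differ.
    """
    vals = sorted(surrogate_values)
    mid = median(vals)
    if lower_is_closer:
        # Distance-like measure (e.g. AMD): small values indicate similarity,
        # so the cutoff lies in the lower tail of the surrogate distribution.
        cut = vals[round((1.0 - level) * (len(vals) - 1))]
        return (mid - observed) / (mid - cut)
    # Similarity-like measure: the cutoff lies in the upper tail.
    cut = vals[round(level * (len(vals) - 1))]
    return (observed - mid) / (cut - mid)
```

In use, one would compute the chosen similarity on several thousand jittered copies of a pair of trains, pass those values as `surrogate_values`, and compare the value obtained from the recorded pair against them.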



III. COMPARISON TO OTHER ALGORITHMS 



In order to verify the performance of the FCA and compare it to that of existing clustering 
methods, we created simulated spike trains with a known correlation structure. Specifically, 
we created a set of 100 spike trains derived from a Poisson distribution that consist of four 
independent groups, 20 spike trains each, and 20 uncorrelated spike trains. The spike trains 
within these four groups are correlated (see Fig. 2(a)). To create the correlated groups,
we first created a master spike train and used this train to create new trains by randomly 
deleting spikes from the master train with a certain probability. Thus, the resulting train 
was also a Poisson process with a firing rate dependent upon the deletion probability. The 
master train was 5000 time steps long, with each neuron spiking an average of 250 times 
during the duration of the train. To further randomize the timings of the spikes copied from 
the master train, we added jitter (drawn from a standard normal distribution) to the spike 
times. Each correlated group was composed of 20 trains from the same master. The firing 
rate of the independent trains was set to match that of the correlated trains. 
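
The construction of the test data can be sketched as follows. This is an illustration only: it uses a discrete-time Bernoulli approximation to a Poisson process, and the names `poisson_train`, `thinned_copy`, `make_dataset` and `keep_prob` (one minus the deletion probability) are hypothetical. The master rate is chosen here so that each thinned train fires about 250 times in 5000 steps, which is one possible reading of the description above.

```python
import random

def poisson_train(rate, n_steps, rng):
    """Bernoulli approximation to a Poisson train: one trial per time step."""
    return [t for t in range(n_steps) if rng.random() < rate]

def thinned_copy(master, keep_prob, rng, jitter_sd=1.0):
    """Copy of the master train with random deletions and Gaussian jitter."""
    return sorted(t + rng.gauss(0.0, jitter_sd)
                  for t in master if rng.random() < keep_prob)

def make_dataset(n_groups=4, group_size=20, n_independent=20,
                 n_steps=5000, mean_spikes=250, keep_prob=0.5, seed=0):
    """Return (trains, labels); trains sharing a label share a master train."""
    rng = random.Random(seed)
    target_rate = mean_spikes / n_steps
    # Assumption: raise the master rate so thinned trains match target_rate.
    master_rate = target_rate / keep_prob
    trains, labels = [], []
    for g in range(n_groups):
        master = poisson_train(master_rate, n_steps, rng)
        for _ in range(group_size):
            trains.append(thinned_copy(master, keep_prob, rng))
            labels.append(g)
    for k in range(n_independent):
        # Independent trains get their own label and the same firing rate.
        trains.append(poisson_train(target_rate, n_steps, rng))
        labels.append(n_groups + k)
    return trains, labels
```

Lowering `keep_prob` deletes more spikes from each copy and therefore lowers the correlation within a group while the firing rate stays fixed.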

We first applied the FCA to the simulated data described above (Fig. 2(b-c)). In Fig.
2(b) we show the scaled significance at each joining step in the algorithm. The dashed red
line marks the significance cutoff (single 95% confidence interval); points above this line are
statistically significant, and the clustering cutoff is given by the point where the curve drops
below this line. Fig. 2(c) shows the resulting dendrogram with the dashed red line denoting
the clustering cutoff. The algorithm correctly identifies the 4 groups of neurons as well as
the 20 independent neurons.



A. Comparison to the Gravitational Method 

We then compared the performance of the FCA to that of the gravitational method
[23, 24, 25], [26]. This method performs clustering based on the spike times of neuronal firings
by mapping the neurons as particles in N-dimensional space, and allowing their positions to
aggregate in time as a function of their firing patterns. Particles are initially located along
the trace of the N-dimensional space and given a 'charge' which is a function of the firing
pattern of the neuron. The charge $q_i$ on a particle is given by

$$q_i(t) = \sum_k K(t - T_k) - \lambda_i, \qquad (3)$$

where $K(t) = \exp(-t/\tau)$ for $t > 0$ and $K = 0$ otherwise, $T_k$ are the firing times of the
neuron, and $\lambda_i$ is the firing rate of the neuron, normalized so that the mean charge on a
particle is zero. The position vector, $x_i$, of the particle is then allowed to evolve based upon
the following rule:

$$x_i(t + dt) = x_i(t) + \kappa \, dt \sum_{j \neq i} q_i q_j \frac{x_j - x_i}{|x_j - x_i|}, \qquad (4)$$

where $\kappa$ is a user-defined parameter that controls the speed of aggregation. One then
calculates the Euclidean distance between particles as a function of time and looks for
particles which cluster in the N-dimensional space (i.e., the distance between the particles
becomes small).
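
For concreteness, one update of the gravitational method might be sketched as below. This is not the reference implementation of the method: the function names are hypothetical, the update is written in the standard pairwise form in which a particle moves along the unit vector toward each other particle in proportion to the product of their charges, and `baseline` stands for the normalized rate term that makes the mean charge zero (for a stationary train this would be the firing rate times the kernel time constant, which is an assumption here).

```python
import math

def charge(train, t, tau, baseline):
    """Charge of one particle at time t: exponentially decaying kernel summed
    over past spikes, minus a baseline chosen so the mean charge is zero."""
    return sum(math.exp(-(t - tk) / tau) for tk in train if tk < t) - baseline

def gravitational_step(x, q, kappa, dt):
    """One Euler step of the position update for all particles.

    x : list of position vectors (lists of floats), one per particle
    q : list of charges at the current time
    Particles whose charges have the same sign move toward each other;
    opposite signs move them apart.
    """
    new_x = [row[:] for row in x]
    for i in range(len(x)):
        for j in range(len(x)):
            if i == j:
                continue
            diff = [xj - xi for xi, xj in zip(x[i], x[j])]
            dist = math.sqrt(sum(d * d for d in diff))
            if dist == 0.0:
                continue  # coincident particles exert no directed pull
            for d in range(len(diff)):
                new_x[i][d] += kappa * dt * q[i] * q[j] * diff[d] / dist
    return new_x
```

Iterating `gravitational_step` with charges re-evaluated at each time step, and recording the pairwise Euclidean distances, gives distance-versus-time traces of the kind the method inspects.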

Fig. 3 depicts the results of applying the gravitational method to the simulated data
described above for cases of high correlation ($C \approx 0.63$) within groups (Fig. 3(a,c)) and
also for low correlation ($C \approx 0.13$) within clusters (Fig. 3(b,d)). In Fig. 3(a-b) we plot
the pairwise distances between particles as a function of time in the algorithm. Blue traces
denote distances between intra-cluster trains, green between inter-cluster ones, and red
between any train and an independent train. To visualize the results of the method, we
have sliced these plots as indicated by the dashed vertical line and represent the distances at
this point in time as matrices in Fig. 3(c-d). While, for the case of high correlation between
the spike trains, the algorithm separates the 4 groups correctly (black squares in Fig. 3(c)),
one is unable to distinguish between inter- and intra-cluster trains for the low correlation
case. Furthermore, these plots must be visually inspected for the cutoff (i.e., the time point
at which they stabilize) and the clustering results may significantly depend on its position,
as the algorithm has no inherent stopping point and the rate of aggregation is parameter
dependent. Even then, the detection of the formed clusters may require the application
of an additional N-dimensional clustering algorithm to detect the clusters formed in the
N-dimensional space. Another drawback of this method is that as the particles aggregate
into clusters, the clusters start interacting due to the nature of the algorithm, causing
inter-cluster distances to become significantly lower than those with random trains, which does
not match the correlation structure of the data.

The FCA performed the correct clustering of the data for the case of the high correlation 
and only made an occasional error for data with the low correlation. 



B. Comparison to Complete Linkage and Modularity 

We next compare the performance of the FCA to a method which maps spiking dynamics
onto a structural space and then uses a structural clustering method to determine functional
groupings. The structural clustering method used is a standard hierarchical clustering technique
called complete linkage. Since this algorithm has no inherent cutoff point at which
clustering is stopped, we combine it with a calculation of the weighted modularity [33], which
is a commonly used measure to determine the best set of groupings when dealing with hierarchical
clustering methods. We have also tried other methods (single linkage, GN algorithm
[27]), but complete linkage gave the best results of the methods attempted. Please
see [28] for a review of standard hierarchical clustering techniques.
The complete linkage algorithm again clusters trains based upon a similarity measure. In
this algorithm, a similarity matrix is created and the elements with the maximum similarity
are joined. However, the clusters are formed through virtual grouping of the elements and
there is no recalculation of the similarity measure; the similarity between clusters is simply
defined to be the minimum similarity between elements of the clusters. For the data presented
in this paper, we use the absolute value of the normalized cross-correlation matrix
as our similarity matrix, since this is what is commonly used to examine community
structure in neuroscience applications. To compute this matrix, spike trains are first convolved
with a Gaussian kernel and the signal is demeaned (the mean value of the signal is
subtracted). The cross-correlation is given by

$$\tilde{C}(S_i, S_j) = \frac{C(S_i, S_j)}{\sqrt{C(S_i, S_i) \cdot C(S_j, S_j)}}, \qquad (5)$$

where $C$ is the linear cross-correlation function

$$C(S_i, S_j) = \int_{-\infty}^{\infty} S_i(t) \, S_j(t) \, dt. \qquad (6)$$
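
A discrete-time version of this similarity can be sketched as follows. The kernel width `sigma`, the unit-step time grid and the function names are assumptions introduced for illustration; the integral is replaced by a sum over time steps, and the signals are assumed not to be constant (otherwise the normalization would divide by zero).

```python
import math

def smooth_and_demean(train, n_steps, sigma):
    """Convolve a spike train with a Gaussian kernel on a unit-step grid,
    then subtract the mean of the resulting signal."""
    signal = [sum(math.exp(-0.5 * ((t - s) / sigma) ** 2) for s in train)
              for t in range(n_steps)]
    mean = sum(signal) / n_steps
    return [v - mean for v in signal]

def normalized_xcorr(a, b):
    """Zero-lag cross-correlation of two signals, normalized by their energies."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def xcorr_similarity(train_i, train_j, n_steps, sigma):
    """Absolute normalized cross-correlation between two spike trains."""
    a = smooth_and_demean(train_i, n_steps, sigma)
    b = smooth_and_demean(train_j, n_steps, sigma)
    return abs(normalized_xcorr(a, b))
```

The normalization bounds the value between -1 and 1, so the absolute value used as the similarity lies between 0 and 1.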

Since the complete linkage algorithm has no inherent method of determining the clustering
cutoff, we compute the (weighted) modularity [33] for each step of the algorithm. The
modularity measure was originally tailored to detect the optimal community structure based
upon structural connections between nodes (i.e., the adjacency matrix); however, it can also be
used to detect optimal clustering based on dynamical rather than structural relations, where
the adjacency matrix is substituted with the correlation matrix. The modularity is given by

$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j), \qquad (7)$$

where $A_{ij}$ is our similarity matrix, $k_i = \sum_j A_{ij}$, $m = \frac{1}{2} \sum_{ij} A_{ij}$, and $\delta(c_i, c_j) = 1$ if $i$ and $j$
are in the same community and zero otherwise. The maximum value of the modularity is
then used to define the clustering cutoff.
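
The weighted modularity of a given partition can be computed directly from these definitions. The sketch below assumes the standard weighted form of the modularity, in which each within-community entry of the similarity matrix is compared with the value expected from the row sums; the function name and the list-of-lists matrix format are choices made here for illustration.

```python
def weighted_modularity(A, communities):
    """Weighted modularity of a partition.

    A           : symmetric similarity matrix as a list of lists
    communities : communities[i] is the community label of node i
    """
    n = len(A)
    k = [sum(row) for row in A]   # k_i = sum_j A_ij
    two_m = sum(k)                # 2m = sum_ij A_ij
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += A[i][j] - k[i] * k[j] / two_m
    return q / two_m
```

For two disconnected pairs of nodes, assigning each pair to its own community gives a modularity of 0.5, while placing all four nodes in a single community gives 0. Evaluating the function at every level of a dendrogram and taking the maximum reproduces the cutoff rule described above.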

The complete linkage dendrogram is shown in Fig. 4(b) and the modularity for this
clustering is plotted in Fig. 4(a). The clustering cutoff is defined as the maximum of the
modularity [32], [33]; however, the scaling of the modularity, even in this simple case, provides
ambiguous results. The numerical maximum of the modularity is observed for the clustering
step marked by the dashed red line in Fig. 4, significantly above the clustering step that
starts linking random spike trains. Even if we relax this definition and assume that the
set of high modularity values is equivalent, the exact location of the cutoff is ambiguous, as
shown by the area enclosed in the transparent red box. Note that the FCA does not have
this ambiguity, as the cutoff is quite clear and the algorithm correctly identifies the groups
embedded in the spike train data.

To further explore the performance of the FCA in comparison with complete linkage and
modularity, we monitor the performance of both methods for progressively lower correlations
within the four clustered groups (Fig. 5). We did not perform this analysis for the
gravitational method since that algorithm has no predetermined stopping point and cluster
identification must be assessed by the user. As before, the intra-cluster correlation is
controlled through progressive, random deletion of spikes from a master train. In order to
compare the performance of the two algorithms, it is necessary to compare the obtained
clusterings to the known structure of the data. To assess the correctness of the retrieved
clusters as compared to the actual structure of the network, we calculate the normalized mutual
information (NMI) [9], [42] as a function of the average correlation within the constructed
groups. The NMI is a measure used to evaluate clustering algorithms and determine how
well the obtained clustering, $C'$, matches the original structure, $C$. To compute the NMI,
one first creates a matrix with $c$ rows and $d$ columns, where $c$ is the number of communities
in $C$ and $d$ is the number of found communities in $C'$. An entry, $N_{ij}$, is defined to be the
number of nodes in community $i$ that have been assigned to the found community $j$. If we
denote $N_{i./.j} = \sum_{j/i} N_{ij}$ and $N = \sum_{ij} N_{ij}$, then we can define

$$\mathrm{NMI}(C, C') = \frac{-2 \sum_{i=1}^{c} \sum_{j=1}^{d} N_{ij} \log\left( \frac{N_{ij} N}{N_{i.} N_{.j}} \right)}{\sum_{i=1}^{c} N_{i.} \log\left( \frac{N_{i.}}{N} \right) + \sum_{j=1}^{d} N_{.j} \log\left( \frac{N_{.j}}{N} \right)}. \qquad (8)$$

This measure is based on how much information is gained about $C$ given the knowledge of
$C'$. It takes a minimum value of 0 when $C$ and $C'$ are independent, and a maximal value of
1 when they are identical.
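
The NMI between a known and a retrieved partition can be computed from two label lists, as in the following sketch. It assumes the standard confusion-matrix form of the NMI used for comparing community structures; the function name is hypothetical, empty cells of the confusion matrix contribute nothing to the sum, and both partitions are assumed to contain more than one group (otherwise the denominator vanishes).

```python
from collections import Counter
from math import log

def normalized_mutual_information(true_labels, found_labels):
    """NMI between two partitions given as equal-length label lists."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, found_labels))  # nonzero N_ij entries
    rows = Counter(true_labels)                      # row sums over found groups
    cols = Counter(found_labels)                     # column sums over true groups
    numerator = -2.0 * sum(
        n_ij * log(n_ij * n / (rows[i] * cols[j]))
        for (i, j), n_ij in joint.items()
    )
    denominator = (sum(c * log(c / n) for c in rows.values())
                   + sum(c * log(c / n) for c in cols.values()))
    return numerator / denominator
```

The value depends only on how nodes are grouped, not on the label names, so a retrieved clustering that matches the true one up to relabeling scores 1.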

In Fig. 5 we use the NMI to compare the obtained clustering with the known structure of
the simulated data. As shown in the figure, complete linkage and modularity consistently fail
to identify the correct structure. This is because the maximum of the modularity occurs for
a point in the algorithm where various independent spike trains have been joined, creating
erroneous group structure. However, the FCA correctly identifies neurons for almost all
values of correlation. Please note that the 80% level of correctness using complete linkage
and modularity for higher intra-cluster correlation values is due to the fact that we had only
24 independent groups (20 spike trains + 4 independent clusters) in the tested network. A
higher number of independent neurons would lead to a poorer performance of that method
(due to the erroneous grouping of independent neurons) and thus higher relative effectiveness
of the FCA.
IV. APPLICATION TO EXPERIMENTAL DATA



In order to show possible applications of the FCA to real data, we examined spike trains
recorded from the hippocampus of a freely moving mouse, using tetrode recording methods
[43]. All animal experiments were approved by the University of Michigan Committee on
the Use and Care of Animals. In this report, we focus on the population of pyramidal
neurons (77 total; by subregion: 42 CA1, 21 CA2, 14 CA3). While recording this cell
population, the mouse was placed in a novel rectangular track environment. The mouse
initially explored the environment by running approximately 20 laps, then settled down,
and shortly thereafter fell asleep. A raster plot of this data is shown in Fig. 6. This data
set is of interest for two reasons. Firstly, there are established differences in the functional
organization of hippocampal networks between active exploration and slow-wave sleep [44].
These include the joint activation of pyramidal cell ensembles at timescales corresponding to
gamma frequencies during awake movement [45], and the high-speed replay of pyramidal cell
sequences within ripple events that occur preferentially during slow-wave sleep and rest [46].
Secondly, the mouse learned a new spatial representation during exploration of the novel
environment (as indicated by the formation of "place fields" [43]) and the subsequent epoch
of slow-wave sleep has been hypothesized to be a period of memory consolidation [47], [48]
that is presumed to involve alterations in structural and thus functional network connectivity.
These structural alterations involve the strengthening of existing monosynaptic connections
between the neurons. Furthermore, recent experimental findings have shown that memory
consolidation of the neural representation of novel stimuli results in two changes: neurons
that are correlated during initial exposure progressively increase their co-firing, while the
neurons that have shown a loose relation become further de-correlated [49]. In terms of
network reorganization, this should lead to the tightening of the cluster of cells involved in
the coding of the new environment and, at the same time, a functional decoupling from the
other cells.

Given these functional differences between the behavioral states of the mouse, we
expected to see different clustering patterns during the exploration and sleep phases.

In Fig. 7(a) we show the scaled significance used in the FCA during the initial exploration
as well as the first sleep period. For this data, the jitter added to surrogates was
drawn from a normal distribution with a standard deviation of 10 s to destroy long-term
correlations between neurons. The cutoff point in the algorithm occurs when the scaled 
significance drops below the dashed red line. The step in the algorithm at which this cutoff 
occurs indicates the number of neurons involved in the clustering. Thus if a cutoff occurs 
for a late (as opposed to early) step in the clustering, more neurons are recruited into the 
clusters. One can see that there is an increase in the number of significant pairs being 
clustered during the sleep period (due to the later stage of cutoff), consistent with the 
increased co-activation of neurons known to occur during sleep ripples. 

We then compared the initial exploration of the novel environment to a subsequent ex- 
ploration of the same environment (after the sleep epochs). Here, we hypothesized that, due 
to memory consolidation and the associated changes in correlations between neurons, we 
would observe a selective drop in the joining AMD when comparing the initial exposure to 
a novel environment to a subsequent exposure once the environment has become familiar. 
This drop should occur for initially small AMD values (initially correlated neurons) as these 
neurons become further correlated. However, for initially large (insignificant) AMD values, 
we expect an increase in the AMD values when comparing novel and familiar exploration. 
This growth occurs as the neurons with low correlations become further uncorrelated. 

To assess any changes in the AMD values between initial (novel) and familiar exploration,
we must introduce a frequency correction during the calculation of the $D_{ij}$ values (note that
the effect of spiking frequency in the measure is accounted for in the algorithm through the
comparison to surrogate data). Here, we normalize these distances by the average expected
distance obtained from uniformly distributed spike trains having the same spike frequency:
$\mu_{ij/ji} = \Delta T / [2 (N_{j/i} + 1)]$, where $\Delta T$ is the train length. Thus,

$$\tilde{D}_{ij/ji} = \frac{D_{ij/ji}}{\mu_{ij/ji}}. \qquad (9)$$

We then define the AMD between trains $S_i$ and $S_j$ to be

$$\mathrm{AMD}_{ij} = \tilde{D}_{ij} + \tilde{D}_{ji}. \qquad (10)$$

Lower values of AMD indicate tighter functional clustering between the cells.
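
A sketch of the frequency-corrected AMD is given below. It rests on an assumption about the normalization: the expected distance from a spike to the nearest of N uniformly distributed spikes on a train of length ΔT is taken to be ΔT / [2(N + 1)], which is the exact value when edge effects are ignored. The function names and the `train_length` argument are hypothetical.

```python
from bisect import bisect_left

def mean_min_distance(s_i, s_j):
    """Mean distance from each spike in s_i to the closest spike in s_j."""
    s_j = sorted(s_j)
    total = 0.0
    for t in s_i:
        pos = bisect_left(s_j, t)
        total += min(abs(t - u) for u in s_j[max(pos - 1, 0):pos + 1])
    return total / len(s_i)

def normalized_amd(s_i, s_j, train_length):
    """Frequency-corrected AMD: each directed average is divided by the
    distance expected for uniformly placed spikes at the same rate."""
    mu_ij = train_length / (2.0 * (len(s_j) + 1))  # expected distance to s_j
    mu_ji = train_length / (2.0 * (len(s_i) + 1))  # expected distance to s_i
    return (mean_min_distance(s_i, s_j) / mu_ij
            + mean_min_distance(s_j, s_i) / mu_ji)
```

With this normalization each directed term is close to 1 for unrelated trains regardless of their firing rates, which is what makes values from different recording epochs comparable.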

In Fig. 7(b), we show changes of the average AMDs used to cluster the neurons for the
clustering steps which have a significantly lower AMD than that obtained from surrogates
(i.e., co-firing cells), during novel exploration and a subsequent familiar exploration. We
indeed see that the average AMD value is lower for neurons during the familiar exploration,
indicating that the firing patterns of the neurons are more tightly correlated. Thus, as in
the case of [49], the observed decrease of the AMD during the subsequent presentation of
the novel environment occurs for neurons which fire in the same spatial locations of the
maze. In Fig. 7(c), we show the average AMD distances for the non-significant clustering
steps during the novel and familiar exploration. These distances are greater during the
familiar exploration as the activity of the neurons having low correlation becomes even less
correlated.



V. CONCLUSIONS 



In conclusion, we have developed a new Functional Clustering Algorithm to perform 
grouping based on relative activity patterns of discrete event data sets. We applied this 
algorithm to neural spike train data, and have shown that the new algorithm performs 
better than existing ones in simple test cases, using simulated data. Additionally, we showed 
that the algorithm successfully detects state-related changes in the functional connectivity 
of the mouse hippocampus. Functional Clustering should therefore be a useful tool for the 
detection and analysis of neuronal network changes occurring during cognitive processes 
and brain disorders, as well as other dynamical biological/physical phenomena that can be
represented by discrete time series. 
correlation within clusters (b). Blue traces: intra-cluster distances, green traces: inter-cluster
FIG. 4: (Color online) (a) Modularity calculation for the clustering obtained using complete linkage.
The transparent red box marks the ambiguous cutoff area. (b) Dendrogram indicating clustering
by complete linkage. Here the clustering cutoff is ambiguous and the algorithm fails to identify the
appropriate structure.
FIG. 5: (Color online) Normalized mutual information as a function of average group correlation.
FIG. 7: (Color online) (a) The scaled significance used in clustering calculated for novel exploration
(0-200 s) and the first sleep period (900-1100 s). The significance cutoff is shown by the dashed
line. The FCA is able to detect the greater number of neurons involved in joint firing known
to occur during sleep. (b) Comparison of the AMD averaged over significant clustering steps
from novel exploration and a subsequent familiar exploration. We observe a decrease in this value
during the familiar exploration as correlations between neurons become tighter. (c) Comparison
of the AMD distances averaged over non-significant clustering steps during novel and familiar
exploration. Here we see an increase in this value during familiar exploration as neurons which
were uncorrelated become further de-correlated.